19 research outputs found

The need for open source software in machine learning

Author: Bengio Samy
Bottou Leon
Braun Mikio L
Holmes Geoffrey
LeCun Yann
Mueller Klaus-Robert
Ong Cheng Soon
Pereira Fernando
Raetsch Gunnar
Rasmussen Carl E
Schoelkopf Bernhard
Smola Alexander
Sonnenburg Soren
Vincent Pascal
Weston Jason
Williamson Robert
Publication venue: 'MIT Press - Journals'
Publication date: 09/12/2015

Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for diverse applications. However, the true potential of these methods is not used, since existing implementations are not openly shared, resulting in software with low usability and weak interoperability. We argue that this situation can be significantly improved by increasing incentives for researchers to publish their software under an open source model. Additionally, we outline the problems authors are faced with when trying to publish algorithmic implementations of machine learning methods. We believe that a resource of peer-reviewed software accompanied by short articles would be highly valuable to both the machine learning and the general scientific community.

The Australian National University

Assessing the gene regulatory landscape in 1,188 human tumors

Author: Brazma Alvis
Calabrese Claudia
Erkek Serap
Fonseca Nuno
Kahles Andre
Kilpinen-Barrett Leena Helena
Korbel Jan
Lehmann Kjong-Van
Liu Fenglin
Markowski Julia
PCAWG-3
Raetsch Gunnar
Schwarz Roland
Stegle Oliver
Urban Lara
Waszak Sebastian
Zhang Zemin
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/2017

Cancer is characterised by somatic genetic variation, but the effect of the majority of non-coding somatic variants and the interface with the germline genome are still unknown. We analysed the whole genome and RNA-seq data from 1,188 human cancer patients as provided by the Pan-cancer Analysis of Whole Genomes (PCAWG) project to map cis expression quantitative trait loci of somatic and germline variation and to uncover the causes of allele-specific expression patterns in human cancers. The availability of the first large-scale dataset with both whole genome and gene expression data enabled us to uncover the effects of the non-coding variation on cancer. In addition to confirming known regulatory effects, we identified novel associations between somatic variation and expression dysregulation, in particular in distal regulatory elements. Finally, we uncovered links between somatic mutational signatures and gene expression changes, including TERT and LMO2, and we explained the inherited risk factors in APOBEC-related mutational processes. This work represents the first large-scale assessment of the effects of both germline and somatic genetic variation on gene expression in cancer and creates a valuable resource cataloguing these effects.

Repository for Publications and Research Data

MDC Repository

Integrated multi-omics reveals anaplerotic rewiring in methylmalonyl-CoA mutase deficiency

Author: Aebersold Ruedi
Baumgartner Matthias R
Bingisser Anna
Bonilla Ximena
Cherkaoui Sarah
Dermitzakis Emmanouil
Forny Merima
Forny Patrick
Frei Caroline
Froese D Sean
Goetze Sandra
Harshman Keith
Howald Cédric
Lamparter David
Morscher Raphael J
Pedrioli Patrick G A
Plessl Tanja
Poms Martin
Raetsch Gunnar
Shao Wenguang
Simmons Luke
Traversi Florian
van Drogen Audrey
Wollscheid Bernd
Xenarios Ioannis
Zamboni Nicola
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/2022

Multi-layered omics approaches can help define relationships between genetic factors, biochemical processes and phenotypes, thus extending research of inherited diseases beyond identifying their monogenic cause [1]. We implemented a multi-layered omics approach for the inherited metabolic disorder methylmalonic aciduria (MMA). We performed whole genome sequencing, transcriptomic sequencing, and mass spectrometry-based proteotyping from matched primary fibroblast samples of 230 individuals (210 affected, 20 controls) and related the molecular data to 105 phenotypic features. Integrative analysis identified a molecular diagnosis for 84% (177/210) of affected individuals, the majority (148) of whom had pathogenic variants in methylmalonyl-CoA mutase (MMUT). Untargeted analysis of all three omics layers revealed dysregulation of the TCA cycle and surrounding metabolic pathways, a finding that was further corroborated by multi-organ metabolomics of a hemizygous Mmut mouse model. Integration of phenotypic disease severity indicated downregulation of oxoglutarate dehydrogenase and upregulation of glutamate dehydrogenase, two proteins involved in glutamine anaplerosis of the TCA cycle. The relevance of disturbances in this pathway was supported by metabolomics and isotope tracing studies, which showed decreased glutamine-derived anaplerosis in MMA. We further identified MMUT to physically interact with both oxoglutarate dehydrogenase complex components and glutamate dehydrogenase, providing evidence for a multi-protein metabolon that orchestrates TCA cycle anaplerosis. This study emphasizes the utility of a multi-modal omics approach to investigate metabolic diseases and highlights glutamine anaplerosis as a potential therapeutic intervention point in MMA.
Take home message: Combination of integrative multi-omics technologies with clinical and biochemical features leads to an increased diagnostic rate compared to genome sequencing alone and identifies anaplerotic rewiring as a targetable feature of the rare inborn error of metabolism methylmalonic aciduria.

ZORA

Robustes Boosting durch konvexe Optimierung (Robust boosting via convex optimization)

Author: Raetsch Gunnar
Publication venue: Bonner Kollen Verlag
Publication date

The Australian National University

An Introduction to Boosting and Leveraging

Author: Meir Ron
Raetsch Gunnar
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 12/12/2015

We provide an introduction to theoretical and practical aspects of Boosting and Ensemble learning, providing a useful reference for researchers in the field of Boosting as well as for those seeking to enter this fascinating area of research. We begin with a short background concerning the necessary learning theoretical foundations of weak learners and their linear combinations. We then point out the useful connection between Boosting and the Theory of Optimization, which facilitates the understanding of Boosting and later on enables us to move on to new Boosting algorithms, applicable to a broad spectrum of problems. In order to increase the relevance of the paper to practitioners, we have added remarks, pseudo code, "tricks of the trade", and algorithmic considerations where appropriate. Finally, we illustrate the usefulness of Boosting algorithms by giving an overview of some existing applications. The main ideas are illustrated on the problem of binary classification, although several extensions are discussed.

The Australian National University

Maximizing the margin with boosting

Author: Raetsch Gunnar
Warmuth Manfred
Publication venue: Springer
Publication date

The Australian National University

On the convergence of leveraging

Author: Mika Sebastian
Raetsch Gunnar
Warmuth Manfred
Publication venue: 'MIT Press - Journals'
Publication date: 11/12/2015

The Australian National University

Constructing boosting algorithms from SVMs: an application to one-class classification

Author: Mika Sebastian
Mueller Klaus-Robert
Raetsch Gunnar
Schoelkopf Bernhard
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/12/2015

We show via an equivalence of mathematical programs that a support vector (SV) algorithm can be translated into an equivalent boosting-like algorithm and vice versa. We exemplify this translation procedure for a new algorithm, one-class leveraging, starting from the one-class support vector machine (1-SVM). This is a first step toward unsupervised learning in a boosting framework. Building on so-called barrier methods known from the theory of constrained optimization, it returns a function, written as a convex combination of base hypotheses, that characterizes whether a given test point is likely to have been generated from the distribution underlying the training data. Simulations on one-class classification problems demonstrate the usefulness of our approach.

The Australian National University